-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Project Arbex (Arbitrary Expressions) #4777
Conversation
docs/style-spec/expressions.md
Outdated
|
||
###Decision: | ||
- `["case", cond1: Boolean, result1: T, cond2: Boolean, result2: T, ..., cond_m, result_m: T, result_otherwise: T] -> T` | ||
- `["match", x: T, a_1: T, y_1: U, a_2: T, y_2: U, ..., a_m: T, y_m: U, y_else: U]` - `a_1`, `a_2`, ... must be _literal_ values of type `T`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we generalize this so that multiple literal values can be mapped to the same output expression?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simplest syntax for this would be:
["match", x: T, [a_1: T, a_2: T, ... a_N: T], y_a: U, [b_1: T, b_2: T, ..., b_M], y_b: U, ..., y_else: U]
Only bad thing about this is that it would take us away from the purity of every array in the JSON representing an expression (since we could consider the curve
types like ["linear"]
to be expressions of an private CurveType
type). Is maintaining such syntax purity worthwhile? Some ways we could do that:
-
["match", x: T, [ "vector", a_1: T, a_2: T, ... a_N: T], y_a: U, [ "vector", b_1: T, b_2: T, ..., b_M], y_b: U, ..., y_else: U]
. But allowing a"vector"
constructor may not be a great idea, since we don't have a way for users to specify arbitrary types. -
We could use
"json_array"
instead -- but that's not great type-wise, since it specifically producesVector<Value>
, notVector<T>
. -
A specialized
"match_case"
expression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use the simplest syntax. It's fine if "match"
is a special form. (case
is in scheme too.)
docs/style-spec/expressions.md
Outdated
- `Boolean` | ||
- Literal: `true` or `false` | ||
- `Color` | ||
- `JSONObject` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's just call it Object
.
docs/style-spec/expressions.md
Outdated
- TODO: without type inference, 0-length arrays and vectors can't be typed | ||
- `Value`: A "variant" type representing the set of possible values retrievable from a feature's `properties` object (`Null | String | Number | Boolean | JSONObject | Vector<Value>`) | ||
- `Error`: a subtype of all other types. Used wherever an expression is unable to return the appropriate supertype. Carries diagnostic information such as an explanatory message and the expression location where the Error value was generated. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about enumerated types? I think we'll find that leaving them stringly-typed is less than ideal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jfirebaugh Can you say more about why you think we’ll want these over strings? specifically: are there cases where such types would be useful for checking what a subexpression evaluates to?
So far, the use cases I can think of for enums are for checking the final output of an expression that's being used for an enum-typed style property like symbol-placement
. If it's true that this is the main reason for them, then the question is which is simpler/better: adding enums to the type system here, or treating the validation of style property values as a separate step?
Since there are, e.g., numeric properties like *-opacity
that can't really be validated via types (at least not normal ones), seems like we're gonna have the extra validation step anyway, which makes me lean towards the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be wrong, just a gut feeling that we'll run into issues in practice, or as we implement it (especially in native). I don't have any concrete examples.
docs/style-spec/expressions.md
Outdated
###Lookup: | ||
- `["get", obj: JSONObject, key: String ] -> Value` | ||
- `["has", obj: JSONObject, key: String ] -> Boolean` | ||
- `["at", arr: JSONArray, index: Number] -> Value` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is redundant to the polymorphic version on the next line, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, yep thanks.
docs/style-spec/expressions.md
Outdated
- min, max | ||
|
||
###String: | ||
- `["concat", expr1: T, expr2: U, …] -> String` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
["concat", expr: String, …] -> String
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking it might be okay, as a convenience, for concat
to implicitly convert all of its arguments, on the premise that this should always be possible without error as long as the value isn't an Error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason to support implicit conversion for concat
but not other expressions (+
, &&
, up/downcase
, ...)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a hugely compelling one... just the combination of:
- Combining string literals, string property values, and numeric property values will be a very common use case (i.e. the equivalent of using
{}
token replacement fortext-field
- Automatic conversions to
String
will never fail for values that aren't alreadyError
, so the explicit cast doesn't seem to me to add any value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there by any advantage to having a string formatting function instead of concat?
On one hand it would add complexity because it would add function-specific syntax. On the other hand you could handle the entire "this is what I want the displayed text to be" in a single function, which might be more UI-friendly.
} | ||
``` | ||
|
||
## Property Expressions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This document refers exclusively to “expressions”, but the title of this PR refers to “arbitrary expressions”. To avoid confusion, we should settle on one name for the feature and eliminate the other name. As catchy as “Arbex” sounds, I prefer simply “expressions” for these reasons:
- Studio users, who may be unfamiliar with the style specification’s evolution, may perceive “arbitrary” as a negative trait.
- “Arbitrary expressions” would seem to imply a non-arbitrary expression feature that doesn’t exist.
- “Expressions” aligns with common usage on the iOS platform and in other kinds of software.
/cc @dasulit
Ideally this will no longer be necessary after this happens: mapbox#4777
b75efe5
to
c1c29cb
Compare
@anandthakker Could we consider using a (formal) specification for the expression syntax? Something like ABNF would let us use existing parser generators like ANTLR and others to generate platform specific bindings and also parts of the core implementations and tests. |
@ivovandongen Yeah, I've definitely been pondering this. There are some things that would make it challenging to specify expressions fully and formally:
Beyond that, it's also just more machinery in general. As such, when @jfirebaugh and I discussed this yesterday we landed on the notion that, at least for now, we could just use the module in GL JS that defines each expression (found at |
Ideally this will no longer be necessary after this happens: mapbox#4777
throw new ParsingError(key, `Expected string, but found ${typeof name} instead`); | ||
|
||
if (context.definitions[name]) | ||
throw new ParsingError(key, `"${name}" is reserved, so it cannot not be used as a "let" binding.`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is getting compiled to function (name, ...) { ... }
, we need to either prohibit JS keywords as well, or sanitize names, say by prefixing them with _
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 yep, tracking that on my list of tail work but forgot to add it to the top of this ticket. Adding now.
docs/style-spec/expressions.md
Outdated
- `[ "e" ] -> Number` | ||
|
||
### Variable binding: | ||
- `[ "let", name1: String, e1, name2: String, e2, ..., e_result ]` - Bind expression `e1` to the string `name1`, `e2` to `name2`, etc., before evaluating `e_result`. The bound expressions may be referenced within `e_result` with `[ name1 ]`, `[ name2 ]`, etc. (E.g.: `["let", "a", 1, "b", ["number", ["get", "blah"]], [ "+", ["a"], ["b"] ]`.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we want to use [name1]
as the syntax for this -- that would mean that any new expression form we want to add in a future version would potentially break existing expressions that used that name as an identifier. Better to have an explicit form like ["var", name]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤦♂️ yeah. ["ref", name1]
? ["var", name1]
? Less concise, but actually might allow us to avoid some special logic for var reference expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than extending LambdaExpression
, the Expression
subclasses for match
, case
, curve
, and coalesce
should directly implement Expression
, providing custom parse
, typecheck
, and compile
methods, and using structured member data rather than a generic args
. For example, the member data for MatchExpression
should be something like:
input: Expression;
cases: Array<[Array<string | number>, Expression]>;
otherwise: Expression;
This will reduce the complexity of LambdaExpression
and other parts of the codebase, likely removing the need for typename
, nargs
, isGeneric
, and so on. At the same time, using more strictly typed member data will reveal edge cases that are not currently handled/tested. As an experiment, I sketched this out for match
. Edge cases:
- No arguments (
["match"]
) - One argument (
["match", foo]
) - Two arguments (
["match", foo, z]
) - Null or boolean key (
["match", foo, null, x, z]
) -- should we allow this? It's better written withcase
IMO. - Object or array key (
["match", foo, {...}, x, [[...]], y, z]
) - Mixed key types (
["match", foo, 0, x, "0", y, z]
,["match", foo, [0, "0"], x, z]
). This is currently allowed, which surprised me, but I guess it is useful for munging datasets that are inconsistently typed. - Mismatched input/key types (
["match", 0, "0", x, "1", y, z]
) - Mismatched result types (
["match", foo, 0, 0, 1, "1", false]
) - Repeated keys (
["match", foo, 0, x, 0, y, z]
) - Non-integer keys (
["match", foo, 0.333, x, z]
)
Continuing to look at how to remove Besides
Another thing I noticed is that
Note that there's no "expected" type input to this judgement: if you need to check that t3 is a subtype of some expected type, than you do that externally to checking |
@jfirebaugh 👍 I came to similar tentative conclusions re:
Oh yeah, good point -- this was originally the case to help make type inference of generics more straightforward, but now that we're eliminating that need, we can simplify it 🙌 |
context.key, | ||
'The "zoom" expression may only be used as the input to a top-level "curve" expression.' | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check should happen in function/index.js
rather than here, because this restriction on [zoom]
only applies to expressions used for style property values (as opposed to, say, filters)
2d82f1a
to
3bc2561
Compare
cab3b58
to
db7c15c
Compare
75fcd51
to
73fff34
Compare
@mourner @jfirebaugh posted some benchmarks in the top of the ticket - looks like DDS is about 2x slower with expressions, but hopefully quick profiling will yield some low-hanging fixes that will reclaim some of that. |
@anandthakker thanks! The frame-duration 6ms to 8ms change is also worth looking into. |
#5150 adds to frame-duration as well (because it's doing collision detection during rendering). It's mainly an issue for zooms/maps with very dense line labeling, which is kind of the worst case for collision detection. We should try to keep an eye on how the two PRs interact. |
@mourner @ChrisLoer After 6f4a6c4 the impact on |
6f4a6c4
to
f6e716b
Compare
src/style-spec/function/compile.js
Outdated
|
||
return { | ||
result: 'success', | ||
function: fn.bind(evaluationContext()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might be worth trying a custom bind
or an arrow fn — calls of natively binded functions are expensive in many browsers
} else if (expectedType.kind === 'Value') { | ||
typeError = true; | ||
} else if (expectedType.kind !== 'Array') { | ||
typeError = typeof value !== expectedType.kind.toLowerCase(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: toLowerCase
will be called on almost every get
; might be preferable to make a lookup table for kinds instead
} catch (e) { | ||
if (thunks.length === 0) throw e; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we do away with a simple for
loop here? Here, we create an array from arguments, and then pop values one by one from it when we can do away with just iterating over them, which is much cheaper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this is actually obsolete code after df245c1 - will remove.
2ab1313
to
d68949a
Compare
d68949a
to
9ee7362
Compare
Ports mapbox/mapbox-gl-js#4777 (and its several follow-ups)
I see that issue #4820 Support contains, begins with, and ends with filter option for strings is referenced in this package. But I cannot figure out the syntax for the expression to filter strings that begin with a given value. Can anyone help me with it? |
Intent
Design and implement a style specification syntax for defining "style functions" using arbitrary expressions. Such functions should be usable for:
Background:
Initial discussion: #4715
Development of working draft: #4754
Important use cases that the final design should cover:
Plan
bezier-easing
dependency in favor of pre-existingunitbezier
match
expression to support multiple keys per output valueBenchmarks
Benchmark scores:
Looking at the dds-specific benchmark being added in #5207, which exercises tile layout with many zoom-and-property functions, looks like about a 2x hit using expressions compared to existing functions:update: After a couple quick fixes, it's more like 1.5x instead of 2x:
@mapbox/gl @mapbox/studio @mapbox/cartography-cats @mapbox/support